1. INTRODUCTION

This report discusses the requirements of a generalized file
management system for LINCS (Language Information Network and Clearinghouse
System).

The flow of language information with books, journals, conferences,
and reports is already inundating us. And it is getting worse every day.
Based on realistic near-term projections of the Center for Applied Linguistics,
the language community will be faced with the accession of almost [illegible]50,000
bibliographic items per year. This number will almost certainly
grow. Although no single individual is expected to remain current with all
the items, the task of sorting out which items are indeed relevant to his
interests becomes increasingly onerous.

Fortunately, concurrent with the increasing flood of language information,
we have been developing new techniques for coping with masses of information.
The increased speed, decreased cost, increased computational power, and decreased
requirements of the modern digital computer are finally beginning to
realize the grandiose promises of earlier years. But most pertinently, new
software methods are emerging that will be powerful aids in processing the
information and data. In particular, large amounts of data which reside in
files, like those data to be stored for the language community, can now be
processed much more conveniently and efficiently. A data processing system
need not be a Procrustean bed for the language information user. In fact, it
is primarily for the convenience and efficiency of the user that a system should
be designed.

File management is the center of the data processing system that will
serve the users. File management simply refers to the functions of storage,
retrieval, updating, and querying of data. Clearly, the success of a computer-based
information system for language will depend greatly on the quality of the
file management system to be chosen. This report will consider the characteristics
of software packages, both in existence and projected for the near future, that
will satisfy the file management requirements of LINCS. The cost of software
has frequently been underestimated. For example, in one large public project
the software was estimated to cost [illegible] million. It eventually cost $26 million.
While this example may be rather isolated, it does suggest the need for a
realistic appraisal of the implications of system design decisions on the file
management software package.
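
The four file management functions named above (storage, retrieval, updating,
and querying) can be illustrated by a minimal sketch. It is not a LINCS design;
the class name FileManager, the record keys, and the sample field values are
all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class FileManager:
    """Toy in-memory file: maps a record key to a dict of fields (hypothetical)."""
    records: dict = field(default_factory=dict)

    def store(self, key, record):
        # Storage: add a new record under a unique key.
        if key in self.records:
            raise KeyError(f"record {key!r} already stored")
        self.records[key] = dict(record)

    def retrieve(self, key):
        # Retrieval: fetch one record by its key.
        return self.records[key]

    def update(self, key, **changes):
        # Updating: change selected fields of an existing record.
        self.records[key].update(changes)

    def query(self, **criteria):
        # Querying: return the keys of all records matching every criterion.
        return sorted(
            key for key, rec in self.records.items()
            if all(rec.get(name) == value for name, value in criteria.items())
        )


# Sample data is invented for the example.
fm = FileManager()
fm.store("A1", {"author": "Smith", "year": 1968})
fm.store("A2", {"author": "Jones", "year": 1968})
fm.update("A2", author="Smith")
matches = fm.query(author="Smith", year=1968)
```

A generalized system would keep the same four operations but let the user
define the record format, rather than fixing it in the program.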

The LINCS file management system must support the storage and retrieval
of bibliographic references, subject terms, abstracts, and full text of technical
documentation in the field of linguistic science. Furthermore, a file management
system will be required for retrieval of formatted linguistic data, and for
the direction of LINCS itself, with respect to costs, user statistics, etc.
A generalized file management system is a computer-based system that separates
the language information user from the mechanics of data processing. A
generalized file management system may be considered for LINCS because it
offers flexibility in input and output formats and search strategies, and
because it adapts to changing requirements in a cost-effective manner.

The term data management is used here in its most generic sense
and encompasses all automatic processing of data. As such, it includes the
fields of generalized data management, file management systems, information
storage and retrieval systems, and specific application-oriented data processing
systems. The areas of technology in data management that apply specifically to
LINCS fall into two categories:

(1) Generalized data management systems that accept the
definition of complex data structures of a formatted
variety, prepare directories and definition tables of
those data, and provide access to those data for users
and programmers; and

(2) Information storage and retrieval systems that provide
special mechanisms for aligning the vocabulary of
untutored users asking questions with the vocabulary of the indexers
and authors of the documents in a reference-providing
system.

This report concentrates on the file management system requirements for
LINCS and develops the evaluation criteria applicable to generalized file management
systems pertinent to the LINCS problem. The report discusses a hypothetical
structural description of a LINCS system, the requirements for the LINCS generalized
file management, the long-range trends in data management technology of interest
to the LINCS problem, and the evaluation criteria applicable to a LINCS file
management system.
1.2 BACKGROUND

The Center for Applied Linguistics (CAL) has initiated a program to
develop an information system for the language sciences. This program,
described in the CAL proposal to the National Science Foundation,* involves
a number of concurrent efforts on the part of CAL and its subcontractors.
The project, in the system design phase, encompasses two concurrent studies:

(A) Advanced studies toward an optimal system configuration, and

(B) Studies of priority components for the system.

The second effort, in turn, involves two sub-areas:

(B1) Indexing Tool Development, and

(B2) System Automation Studies.

This report covers Project B2, System Automation Studies, File Management.
Further background is contained in the above referenced proposal.



* CAL Proposal, "An Information System Program for the Language Sciences,
Stage 2: System Design."



LINCS will be an adaptive network of individuals and organizations
that will provide products and services to facilitate the transfer of linguistic
information, in a variety of media, among the scientific community. The network
will presumably include one or more central clearinghouses with computer-based
facilities and a variety of terminal nodes consisting of user organizations,
individuals, and other existing specialized information centers.

The following elements must be defined in the structural description
of LINCS:

(1) Types of data to be handled and their structural organization
within the system,

(2) Levels of accessibility to data provided through various
automatic storage facilities,

(3) Ways in which individuals and other automated data centers
will interact with LINCS data and network,

(4) Function of automatic data processing hardware and software.

These items are discussed in the subsequent paragraphs of Section 2
of this report. In addition, Section 2 identifies potential alternate modes
of system interaction between LINCS and external bibliographic data bases.
The alternative modes of system interaction are defined according to the
following characteristics:

(1) Bibliographic data elements to be input, stored, and output,

(2) Modes of user query,

(3) Modes of responses to LINCS clients.

2.1 TYPES OF BIBLIOGRAPHIC DATA ELEMENTS

A library or bibliographic information system deals with information
about documents (i.e., document surrogates), with information contained in
documents, and with documents themselves. Therefore, the basic inputs and
outputs to LINCS will be various combinations of bibliographic data elements
and documents. The possible inputs and outputs that may be utilized in an
interactive network are indicated below:

(1) Abstracts or annotations - these may be either informative or
indicative in kind, and may be assigned by an author or by a
cooperating system.

(2) Full-text documents - these may be either the original published
form, a reproduced hard copy, microform, or some other nonpaper
representation.

(3) Bibliographic citations - these commonly include (as appropriate)
author, article title, journal title, volume, issue number,
pagination, date, imprint, report number, and source of document
copy. It is also possible to cite the place of publication of a
document surrogate in a secondary source, in addition to giving
the primary citations.

(4) Accession or call numbers - these are secondary notations used
to identify a document (or document surrogate) in a particular
information system. Under certain circumstances such a number
may be considered as part of a citation or as a substitute for
the citation.

(5) Indexing information - this commonly includes subject terms
and classification numbers, and may also involve personal and
corporate author names.
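
The five kinds of data elements listed above might be held together in a
single record, as in the sketch below. The class names, field names, and the
sample values are assumptions for illustration and do not come from the
report; only a few of the citation fields are shown.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Citation:
    # A subset of the citation fields named in item (3).
    author: str
    title: str
    journal: Optional[str] = None
    volume: Optional[str] = None
    date: Optional[str] = None


@dataclass
class BibliographicRecord:
    citation: Citation                          # (3) bibliographic citation
    abstract: Optional[str] = None              # (1) abstract or annotation
    full_text_location: Optional[str] = None    # (2) where the full text is kept
    accession_number: Optional[str] = None      # (4) accession or call number
    index_terms: list = field(default_factory=list)  # (5) indexing information

    def elements_held(self):
        """Return the names of the data elements present in this record."""
        held = ["citation"]
        if self.abstract:
            held.append("abstract")
        if self.full_text_location:
            held.append("full_text")
        if self.accession_number:
            held.append("accession_number")
        if self.index_terms:
            held.append("index_terms")
        return held


# Invented sample record: a citation with a call number and one subject term.
record = BibliographicRecord(
    citation=Citation(author="Doe, J.", title="A Sample Title"),
    accession_number="ACC-0001",
    index_terms=["phonology"],
)
held = record.elements_held()
```

Making every element except the citation optional reflects the point that
different data bases supply different combinations of these elements.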
2.1.1 Possible Combinations of Bibliographic Data Elements

Various combinations of the previously defined bibliographic data
elements could be used for inputting information from other data bases and in
responding to the queries of LINCS clientele. The relationships between the
data received and held by LINCS and the responses to its users may now be defined.

Table 1 displays and defines relationships between the possible types
of LINCS responses to its clientele and the possible data element combinations
extracted and/or converted from other bibliographic data bases and held by the
system. Ten relationships have been identified. These relationships are
indicated in column 1 by Roman numerals. Column 2 lists, for each relationship
type, a possible combination of data elements suitable for response by LINCS to
clientele queries. There are four basic types of responses. The third column
lists, for each relationship type, a combination of data elements extracted
from other data bases which may be held by LINCS. The fourth column lists
those data elements that must be input processed in some manner to make them
compatible with LINCS holdings. The last column describes briefly the types
of input processing required for each relationship type.

Possible data element responses appear in column 2, possible combinations
of data element holdings appear in column 3, and possible combinations
of data element input processing appear in columns 4 and 5. The ten relationship
types (type I to type X) are developed because of different possible
relationships between the combinations, appearing in the various other columns,
which constitute responses, holdings, and input processing. The basic variables
are the responses and the holdings; all other aspects of each relationship type
are derived therefrom.
Table 1 thus sets forth the relationships between the possible
responses to clientele (as a result of queries) and the possible data element
combinations extracted by or converted by LINCS from other bibliographic data
bases. The ten relationship types set forth in Table 1 provide the basis for
determining all possible alternatives for inter-system interaction.

Table 1 deals with seven different types of bibliographic data
elements:

(1) Annotations or abstracts

(2) Full-text document items

(3) Accession or call numbers

(4) Citations, not including accession or call numbers

(5) Indexing information as assigned by the source data bases,
including subject, author, and publishing source terms

(6) Common or converted indexing information capable of being
merged with LINCS' own self-originated data base and searchable
within LINCS' software and hardware

(7) Sources other than LINCS from which full-text documents
could be obtained

2.2 LEVELS OF ACCESSIBILITY

Any automatic data processing system contains one or more central
processors and a number of various storage devices. Storage devices differ
with respect to the cost of information storage, storage capacity, access time
and mode of access. When a large volume of information must be stored, the
economic effectiveness of the various levels of storage devices must be considered.
An important aspect of the system design will be to specify the kind of information
and the amount of storage required at each level of accessibility. The

TABLE 1. RELATIONSHIPS BETWEEN POSSIBLE LINCS RESPONSES TO CLIENTELE AND
POSSIBLE DATA ELEMENT COMBINATIONS HELD BY LINCS

Type I
  Data Element Responses to Clientele Queries: Annotations or abstracts
    with citations and call or accession numbers.
  Data Element Combinations Held: Annotations or abstracts with citations
    and call or accession numbers, full-text documents, and indexing
    information as assigned by source.
  Data Elements To Be Input Processed: Same as data element combinations
    held by the system.
  Input Processing Requirements: Copying, reproduction, character code
    translation, record reformatting (as required by physical medium,
    format of input and by selected mode of query placement).

Type II
  Data Element Responses to Clientele Queries: Same response as for
    relationship type I.
  Data Element Combinations Held: Same elements held as for relationship
    type I, but indexing information is converted to be merged with
    master data base.
  Data Elements To Be Input Processed: Same input processing as for
    relationship type I, except for conversion of indexing information.
  Input Processing Requirements: Same processing as for type I, except for
    conversion of indexing information to permit merging with master
    data base.

Type III
  Data Element Responses to Clientele Queries: Same response as for
    relationship types I & II.
  Data Element Combinations Held: Same elements held as for relationship
    type I, but full-text documents are not held.
  Data Elements To Be Input Processed: Same as data element combinations
    held by the system.
  Input Processing Requirements: Same as for relationship type I, except
    that reproduction of full-text documents need not be considered.

Type IV
  Data Element Responses to Clientele Queries: Same response elements as
    for relationship type III.
  Data Element Combinations Held: Same elements held as for relationship
    type II, but full-text documents are not held.
  Data Elements To Be Input Processed: Same elements to be processed as for
    relationship type II, except that full-text documents are not
    processed.
  Input Processing Requirements: Same as for relationship type II, but
    reproduction of full-text documents need not be considered.

Type V
  Data Element Responses to Clientele Queries: Citations and call or
    accession numbers.
  Data Element Combinations Held: Same elements held as for relationship
    type I, but annotations or abstracts are not held.
  Data Elements To Be Input Processed: Same as data element combinations
    held by the system.
  Input Processing Requirements: Same as for relationship type I, except
    that annotations or abstracts need not be processed.


TABLE 1 (continued)

Type VI
  Data Element Responses to Clientele Queries: Same response as for
    relationship type V.
  Data Element Combinations Held: Same elements held as for relationship
    type II, but annotations or abstracts are not held.
  Data Elements To Be Input Processed: Same as for relationship type V,
    except for conversion of indexing information.
  Input Processing Requirements: Same as for relationship type V, except
    for conversion of indexing information to permit merging with
    master data base.

Type VII
  Data Element Responses to Clientele Queries: Same response as for
    relationships V & VI, plus sources where full-text documents can be
    obtained.
  Data Element Combinations Held: Same as for relationship type V, except
    full-text documents are not held.
  Data Elements To Be Input Processed: Same as data element combinations
    held by system.
  Input Processing Requirements: Same as for relationship V, except that
    reproduction of full-text documents need not be considered.

Type VIII
  Data Element Responses to Clientele Queries: Same response as for
    relationship type [illegible].
  Data Element Combinations Held: Same as for relationship type VI, except
    full-text documents are not held.
  Data Elements To Be Input Processed: Same as for relationship type VII,
    except for conversion of indexing information.
  Input Processing Requirements: Same as for relationship type VII, except
    for conversion of indexing information to permit merging with
    master data base.

Type IX
  Data Element Responses to Clientele Queries: Same as for relationship
    types I, II, & III.
  Data Element Combinations Held: System holds only the indexing
    information as [illegible].
  Data Elements To Be Input Processed: Same as for data element
    combinations held by the system.
  Input Processing Requirements: Copying, character code translation,
    record reformatting, and merging (with notation of source) to
    provide a combined vocabulary list.

Type X
  Data Element Responses to Clientele Queries: Same response as for
    relationship types VII & [illegible].
  Data Element Combinations Held: System holds only the indexing
    information as assigned by source (type [illegible]).
  Data Elements To Be Input Processed: Same as for data element
    combinations held by the system (type IX).
  Input Processing Requirements: Same as for relationship type IX.

storage devices range over the following levels:

(1) Immediate random access, magnetic core memory where information
is transformed, manipulated, and executed in the central
processor.

(2) A random access backing store for information frequently accessed,
and requiring rapid accessibility, such as the system directory
and indexes to information at lower levels of accessibility.

(3) A random access device with a slower access speed which may be
used to store the document records containing basic bibliographic
information such as accession number, author, title, and subjects.

(4) Low-level storage media such as magnetic tapes may be
used to store the document surrogates or else the documents themselves,
containing such information as the abstract or the full text of the
document.
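
The task of specifying the kind of information and the amount of storage at
each level can be sketched as a simple placement table. The level names, the
placement of each kind of data (taken from the examples in the list above),
and all of the sizes are assumptions for illustration; the sizes are in
arbitrary units and are not estimates for LINCS.

```python
from enum import IntEnum


class StorageLevel(IntEnum):
    CORE_MEMORY = 1      # immediate random access
    BACKING_STORE = 2    # rapid random access backing store
    SLOWER_RANDOM = 3    # random access device with slower access
    TAPE = 4             # low-level media such as magnetic tape


# Which level each kind of information is assigned to (illustrative only).
PLACEMENT = {
    "system_directory": StorageLevel.BACKING_STORE,
    "index": StorageLevel.BACKING_STORE,
    "document_record": StorageLevel.SLOWER_RANDOM,
    "abstract": StorageLevel.TAPE,
    "full_text": StorageLevel.TAPE,
}


def storage_by_level(sizes):
    """Total the storage needed at each level, given {kind: size} amounts."""
    totals = {level: 0 for level in StorageLevel}
    for kind, size in sizes.items():
        totals[PLACEMENT[kind]] += size
    return totals


# Invented sizes in arbitrary units.
totals = storage_by_level({
    "system_directory": 1,
    "index": 5,
    "document_record": 40,
    "abstract": 200,
    "full_text": 3000,
})
```

Multiplying each total by a cost per unit for its level would give a first
comparison of the economic effectiveness of alternative placements.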

In addition to the bibliographic data elements, the data base contains
information required for the operation of the center itself; the programs and
active routines that respond to user inquiry and implement the various search
techniques. This category includes routines that perform the cost-accounting
and effectiveness evaluation of the system, thus permitting designers and
administrators to monitor system performance and obtain insights into possible
performance criteria and system improvements.

2.3 MODES OF [illegible] THE LINCS SYSTEM

Initial inspection has identified five basic modes by which data
contained in, or extracted from, external data bases could be queried in order to
satisfy the needs of LINCS clientele:

(1) Combined

(2) Separate

(3) Remote/On-line

(4) Remote/Off-line

(5) Integrated
These five modes of query placement, when used in combination with the
previously defined types of relationships between LINCS responses to clientele
and LINCS data element holdings, provide the means of defining basic alternatives
of system interaction.
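
The candidate alternatives can be enumerated by pairing each mode of query
placement with each relationship type. The sketch below simply forms every
pairing; it assumes nothing about which pairings are practical, which the
report does not state at this point.

```python
from itertools import product

# The five modes of query placement and the ten relationship types of Table 1.
MODES = ["Combined", "Separate", "Remote/On-line", "Remote/Off-line", "Integrated"]
RELATIONSHIP_TYPES = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"]

# Every (mode, relationship type) pairing is a candidate alternative.
alternatives = list(product(MODES, RELATIONSHIP_TYPES))
```

Some pairings may be ruled out on inspection; for example, a mode that holds
no data locally cannot be paired with a relationship type that requires local
holdings. Such pruning is left to the analysis of each mode.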

2.3.1 Combined

In the combined mode of query placement, the system must convert
extracted portions of other data bases to a common file structure (separate
from LINCS' own data base), to a common computer medium, and to a common format,
so that all the extracted portions can be searched with LINCS software and
hardware but by the use of different terminologies. The advantages of such an
approach include: (1) LINCS control of response time, and (2) ability to search
all areas of the extracted data base in one operation with one set of software.
The problems of this approach include: (1) difficulty of determining the current
relevant portions of external data bases, (2) the on-going maintenance of several
conversion procedures - procedures which are dictated by decisions not under
LINCS control, (3) finding personnel to handle the many search terminologies,
and (4) inefficiency of writing many search formulae for a single request.

2.3.2 Separate

In the separate mode of query placement, the system converts the
extracted portions of other data bases to a common computer medium at a single
location, so that each data base can be searched only by using its appropriate
software and indexing terminology. The advantages of such an approach include:
(1) less initial effort required than in the combined and integrated modes to
establish an operational center, and (2) LINCS control of response time. The
problems of this approach include: (1) the difficulty of determining the current
relevant portions of external data bases, (2) the on-going maintenance of several
different software, terminology, and file structure packages which are subject to
change beyond LINCS control, (3) finding personnel to handle the many query
languages and search approaches, (4) the inefficiency of writing many search
formulae for a single request, and (5) determining a reasonable search priority
of data bases to obtain maximum retrieval efficiency.

2.3.3 Remote/On-line

In this mode of query placement, the system searches an external
data base via remote, on-line consoles, using the external system's index
terminology, software, and hardware (except for terminals, etc.). The advantages
of this approach are that LINCS faces no maintenance tasks, and the entire data
base may be searched without predetermining relevant areas. On the other hand,
LINCS has little control over the system. Difficulties arise in finding search
formulators capable of handling the variety of techniques and languages required
to search many data bases. Experience with the Neurological Information Network
has shown that using two or more vocabularies covering similar material results
in negative interference so that one person cannot efficiently handle more than
one search strategy. Search priorities among data bases would be difficult to
establish. In addition, on-line search capabilities for bibliographic information
have not been completely refined and will probably not be realizable for
several years.

2.3.4 Remote/Off-line

The remote off-line mode is similar to the remote on-line mode; however,
queries are placed by mail, TWX, telephone, etc. One significant advantage of
this approach is that each data base would be searched by personnel familiar with
the system, which would have a positive effect on the output. In addition, the
entire data base could be searched without prior determination of relevant
portions. The approach also frees LINCS of maintenance responsibilities. The
problems of this approach are that LINCS has no quality control and no control
over response time. A further problem arises in determining which services to
interrogate.

2.3.5 Integrated

In the integrated mode of query placement, LINCS converts extracted
portions of other data bases to its own file structure, computer medium, format,
and indexing terminology, thereby permitting the merging of such data into
LINCS' own master bibliographic data base for searching and/or announcement.
The approach has many advantages, including:

(1) Relative ease of system maintenance through use of tables, which
permits acceptance of diverse inputs and production of diverse
outputs. In addition, changes in external data bases over which
LINCS has no control could be accommodated by changing the tables.

(2) A single query language. Making a comprehensive search that
will extract only relevant data requires a great deal of experience
with both the vocabulary and the data base. The use of a single
language and data base simplifies the problem.

(3) Control of response time.

(4) Ability to search all areas of the data base in one operation to
retrieve maximal relevant information.
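
The use of tables named in advantage (1) can be sketched as follows. Each
external data base gets one table that maps its field names to the common
format, so a change in an external format means editing a table rather than a
program. The source names, field codes, and sample records below are invented
for illustration and do not describe any actual data base.

```python
# One conversion table per external data base: external field -> common field.
FIELD_TABLES = {
    "source_a": {"AU": "author", "TI": "title", "KW": "index_terms"},
    "source_b": {"writer": "author", "name": "title", "subjects": "index_terms"},
}


def convert(source, record):
    """Convert one external record to the common format using its table.

    Fields with no entry in the table are dropped.
    """
    table = FIELD_TABLES[source]
    return {table[name]: value for name, value in record.items() if name in table}


# Records from two differently formatted sources merge into one master list.
master = [
    convert("source_a", {"AU": "Doe", "TI": "Syntax", "KW": ["grammar"], "XX": "unused"}),
    convert("source_b", {"writer": "Roe", "name": "Phonology", "subjects": ["sound"]}),
]
```

This sketch converts field names only. Converting the indexing terminology
itself, which the report identifies as the harder problem, would need a
further table relating each source vocabulary to the master vocabulary.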

While the advantages of this approach are great, they may well be
outweighed by the problems. Determining what constitutes the relevant portion
of an external data base is a difficult task, particularly when that portion is
to be used for demand searches. Designing a machine system to process many
diverse formats, yet still provide flexibility and ease of maintenance, is also
a complex task, which adds a large initial cost to the system. The most difficult
task in systems of this type (as evidenced by AUERBACH's experience with the
Neurological Information Network of the NINDB) is to resolve incompatibilities
of thesaurus and indexing approaches. Developing a single query language and
technique that will apply to a consolidated data base requires a substantial
intellectual effort.
3. [illegible] IN LINCS

The costs of system design, programming, and program maintenance have
historically been a very large part of the costs of developing and running a
data processing or information center. In an effort to reduce this cost, various
approaches have been implemented for generalizing data processing functions
involved in a data center. This generalization of functions such as data input,
storage, retrieval, data file maintenance, and reporting is collectively known
as a generalized data management system approach. As a rule, systems are defined
imprecisely at the outset and must undergo modification to meet user needs
effectively. The true nature of the data and the needs of the users are determined
only through experience with the system. Furthermore, the needs of the users
change with time so that repeated modification is necessary.

The LINCS file management system should simplify the development and
modification of programs and expedite the solution of users' problems. The
achievement of this goal will reduce the cost of documentation, modifying, executing
programs and solving users' problems. To achieve this, the software system should
introduce the ability to improve the LINCS file management flexibility by
providing an optimum approach to the following LINCS objectives:

(1) Input from a variety of sources including both local
keyboarding and machine-readable records created by
other organizations,

(2) Output to produce printed primary and secondary publications
with optional indexes,

(3) Ability for dissemination of machine-readable records to
other publication and information centers,

(4) Storage and/or output of managerial control data,

(5) Permanent storage of the data for possible later users in
a retrieval and dissemination system.
The software associated with every computing system can be broken down
into a number of distinct levels with respect to the distance between the module
in question and the computing hardware itself. Effective utilization of the
computer in any information system depends on having a number of modules that
can be identified as the operating system for the computer. The operating system
consists of a number of distinct modules associated with the management of activities
in the computer, the allocation of computing resources to jobs or users of the
computer, and the sequencing of various jobs and tasks. Another important dynamic
function of the operating system is the storage and retrieval of data on various
levels of access devices associated with the computing system.

These functions (resource allocation, activity management, and management
of data on the storage devices) constitute the foundational elements of an
operating system. They are foundational because they dynamically interact with
the running program that makes use of their services and because they are resident
in the computing system's memory and therefore can be called upon to service other
functional elements which are more application oriented or less dynamic in their
interaction with the system.
In addition to these foundational elements, a number of other functional
modules are included in the operating system. These can be viewed as
super-structural elements. One class of super-structural elements are the dynamic
elements associated with interaction with the user such as the console monitor
or batch supervisor, if it is a non-interactive system. Another super-structural
category includes the program development tools and services. These include the
assembler language translators, the compilers, the linkage editor and the loader.
A third category of the super-structural elements can be termed the system support
jobs. These are routines that help the user use the system, but are jobs or tasks
in the same sense that the users' programs are jobs or tasks for the system. Routines
such as library maintenance routines, data definition routines, job definition
routines can all be categorized as system support functions in the operating system.

In addition to these elements of the operating system, which in some
sense simplify the users' interface with and utilization of the hardware, there
are a number of generalized jobs that are not unique to one particular application.
This latter group of programs includes some current software systems known as
data management or file management systems, report writers, query systems,
information retrieval systems, and document processing systems. Because typically these
systems are not fully integrated with the operating system, each presents its own
peculiar input language, user language, and constraints of operation.

One such software package called Document Processing System (DPS of IBM)
would seem at first glance to be very appropriate to the LINCS type of application.
DPS is oriented to documentation systems that provide references. It accepts the
definition of the format of a record which represents the reference-providing
information for a single document. The record can be indexed and queried by the
subject matter so that the user can receive the list of document references,
document numbers or even abstracts of the documents. Yet, DPS presents a serious
constraint to the user. The files of DPS must exist on-line in a single volume on
a disc storage device; this limits its effective use to relatively small data
collections.

Several large-scale generalized data management systems have been under
development for some time without fully realizing the initial goals set forth
by their designers. Systems such as IBM's GIS, AUERBACH's DM-1, IBM's Information
Management System (or IMS), and MITRE's [illegible] can be included in this category.
The shortcomings of these systems and the cost of their development can be
attributed in part to the difficulty of integrating them with existing operating systems
supplied by the manufacturers. Operating systems such as OS 360 for the IBM 360
series of computers are very complex; they represent the high overhead in the
amount of storage that the user must devote to operating the system functions; and
they are difficult for users to modify. Indeed, user modification is generally
impractical because manufacturers frequently change these operating systems, making
it difficult to maintain any one special version.

The long-range trend in the development of operating systems and data
management systems is toward the gradual sophistication of the data management
functions and their integration with the foundational elements of the operating
system. Structurally, the data management systems of the near future will exhibit
a distinct hierarchy of functions. Close to the foundation of the operating system
will be such functions as those that retrieve and store fixed blocks of data from
the secondary storage devices and move them into and out of the computer system's
primary memory. More sophisticated levels of data service support to the programmer
will be built on these machine-oriented functions. These will provide the ability
to handle variable length streams of data, to build and manipulate linked data
structures, and to define and manipulate files of information to be stored on
secondary storage devices.
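
The hierarchy of functions described above can be illustrated with a minimal
sketch in Python. It is an illustration only and is not taken from any of the
systems named in this report: the class names BlockStore and RecordFile, the
in-memory list standing in for a secondary storage device, and the block size
are all assumptions made for the example. The lower level stores and retrieves
fixed blocks; the higher level is built on it and handles variable length
records.

```python
class BlockStore:
    """Lowest level: fixed-size blocks on a simulated secondary storage device."""

    def __init__(self, block_size=16):
        self.block_size = block_size
        self.blocks = []          # stands in for the storage device

    def write_block(self, data: bytes) -> int:
        if len(data) > self.block_size:
            raise ValueError("data exceeds block size")
        # Pad every block to the fixed size before storing it.
        self.blocks.append(data.ljust(self.block_size, b"\x00"))
        return len(self.blocks) - 1

    def read_block(self, number: int) -> bytes:
        return self.blocks[number]


class RecordFile:
    """Higher level: variable-length records built on the fixed-block level."""

    def __init__(self, store: BlockStore):
        self.store = store
        self.directory = []       # one (block numbers, record length) per record

    def append(self, record: bytes) -> int:
        size = self.store.block_size
        # Split the record into block-sized pieces; an empty record uses one block.
        chunks = [record[i:i + size] for i in range(0, len(record), size)] or [b""]
        numbers = [self.store.write_block(chunk) for chunk in chunks]
        self.directory.append((numbers, len(record)))
        return len(self.directory) - 1

    def read(self, record_id: int) -> bytes:
        numbers, length = self.directory[record_id]
        data = b"".join(self.store.read_block(n) for n in numbers)
        return data[:length]      # drop the padding added by the block level
```

The user of RecordFile never sees block boundaries or padding, which is the
sense in which the higher levels are built on the machine-oriented functions.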



The file management system for LINCS exhibits characteristics of this
last category of data management. In LINCS, the more primitive types of data
management functions will be built into the operating system and will provide the
ability to define the files and indexes and retrieval strategies suitable for very
large files of reference-providing information. This type of data management
system will be characterized by the ability to define a structured vocabulary
called a thesaurus that will align the vocabulary of the untutored user to the
rigid terminology of the indexers. The user will be able to conduct a multi-stage
dialogue with the system during which he will learn the vocabulary representing
the areas of his interest. Once having learned the proper index terms and their
generic/specific relationships and perhaps having learned which terms are synonymous
with others, he will be able to formulate a meaningful inquiry to retrieve the
desired information in a discriminatory way.
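
As an illustration of the thesaurus idea, the following Python sketch maps a
user's word to an index term through a synonym table and expands a generic term
to its more specific terms. It is a sketch under stated assumptions, not the
LINCS design: the class Thesaurus, its method names, and the sample terms used
with it are hypothetical, and the generic/specific relationships are assumed to
contain no cycles.

```python
class Thesaurus:
    """A structured vocabulary: synonyms plus generic/specific relationships."""

    def __init__(self):
        self.preferred = {}   # any known word -> the indexers' term
        self.narrower = {}    # generic term -> set of more specific terms

    def add_term(self, term, broader=None, synonyms=()):
        term = term.lower()
        self.preferred[term] = term
        self.narrower.setdefault(term, set())
        for synonym in synonyms:
            self.preferred[synonym.lower()] = term
        if broader is not None:
            self.narrower.setdefault(broader.lower(), set()).add(term)

    def lookup(self, word):
        """Map a user's word to the index term, or None if it is unknown."""
        return self.preferred.get(word.lower())

    def expand(self, term):
        """Return the term together with every more specific term beneath it."""
        found = {term}
        for child in self.narrower.get(term, ()):
            found |= self.expand(child)
        return found
```

In a dialogue of the kind described above, lookup would tell the user which
index term corresponds to his own word, and expand would show him the more
specific terms available for narrowing the inquiry.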

Another trend of future data management systems is the gradual
standardization of the language used to describe the data structures in these
systems. This category of language is known as data description languages or
DDL's. Using a DDL, a system designer might describe the terminology and structure
of a file he wishes to define to a system, or to transmit from one system to
another. The use of a DDL and an appropriate interpreter will enable the designer
to create appropriate directories and to transmit information from one center to
another in an intelligible manner.
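
The relationship between a data description and its interpreter can be sketched
in Python as follows. This is an illustration only: no standard DDL is implied,
and the description format, the field names, and the functions build_directory
and conforms are assumptions invented for the example. The description is plain
data, so it could be transmitted to another center and interpreted there.

```python
# A hypothetical file description, written as plain data.
FILE_DESCRIPTION = {
    "file": "REFERENCES",
    "fields": [
        {"name": "doc_number", "type": "int"},
        {"name": "title", "type": "str"},
        {"name": "index_terms", "type": "str", "repeating": True},
    ],
}

TYPES = {"int": int, "str": str}


def build_directory(description):
    """Interpret a description: field name -> (position, type, repeating)."""
    return {
        f["name"]: (position, TYPES[f["type"]], f.get("repeating", False))
        for position, f in enumerate(description["fields"])
    }


def conforms(record, directory):
    """Check that a record (a dict) has exactly the described structure."""
    if set(record) != set(directory):
        return False
    for name, (_, expected_type, repeating) in directory.items():
        value = record[name]
        if repeating:
            if not isinstance(value, list):
                return False
            values = value
        else:
            values = [value]
        if not all(isinstance(v, expected_type) for v in values):
            return False
    return True
```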

One important trend already apparent in the field of scientific
documentation is the creation of a number of specialized information services and
centers and the attempt to create mechanisms that permit their interaction as a
network of information services. The use of data communication lines between
centers and the adoption of uniform terminology and thesauri support this trend.
AUERBACH is currently engaged in a project to define the capabilities of the
National Agricultural Library for the Department of Agriculture. It is anticipated
that the National Agricultural Library will be one such center in a network of
information centers, consisting primarily of the National Library of Medicine,
the Library of Congress, and the National Agricultural Library and their adjuncts.
Other candidate information centers for inclusion in this network are the
Biosciences Information Services of Biological Abstracts, the Library of the United
States Atomic Energy Commission, the Clearinghouse for Federal Scientific and
Technical Information, the Library of Congress, the National Library of Medicine
and MEDLARS System, the Institute for Scientific Information, and the Chemical
Abstracts Service. Logically, the LINCS system should be one such information
center in a national network.

Much of the research devoted to the problems of designing an information
center and creating a network of such centers is certainly applicable to the LINCS
problem.




5. EVALUATION CRITERIA FOR GENERALIZED FILE MANAGEMENT SYSTEMS

This section discusses the criteria for evaluating file management
systems and techniques pertinent to LINCS. These criteria may be applied to
file management systems recently developed or currently being developed.

AUERBACH and other agencies have already conducted several surveys
and critical evaluations of data management systems. These are listed in the
Bibliography included in this report. Some surveys have concentrated on
tabulating a number of features and parameters which file management systems
may or may not possess. Over 100 such parameters have been tabulated in
reports by Fry et al. (see Bibliography, item 2) and the Codasyl Committee
(item 1). Comprehensive listings of features such as those tabulated can present
a rather bewildering array of factors to be evaluated. These factors must be
placed in proper perspective in assessing their pertinence to the LINCS system.
The difficulty in applying the parameters appearing in prior reports stems from
the fact that each previous report addressed a problem slightly different from
LINCS. In the Fry study and in the Codasyl study, only existing generalized
data management systems were considered. Furthermore, the studies did not
consider combinations of file management systems that did not fall strictly into
the category of a "generalized data base management system."

The features of a data management system that relate to the LINCS
problem will be discussed in this section from several points of view. In
effect, these features represent the criteria by which the systems can be
studied and evaluated. These features are presented from four points of view.

(1) Design Objectives - Discusses the overall goals of the
    system, without regard to the various ways in which these
    goals may be realized.

(2) Functions - Discusses the various system functions and
    capabilities for accomplishing the design objectives.

(3) Structures - Discusses the system components (i.e., tables,
    data structures, and program module structures, and their
    interrelationships) used to perform system functions.

(4) Language Elements - Discusses the system commands and
    service calls which may provide an appropriate interface
    with the system users and programmers.

All of the software elements of the LINCS system can be evaluated
from these points of view. These elements include the operating system with
its machine-oriented job management and data management aspects, the programming
language compilers, and the console monitors which allow user interaction with
the system.

The File Management System (FMS) features are summarized in Table 2
and are discussed in the following paragraphs.

DESIGN OBJECTIVES
5.1.1 Responsiveness

The primary design objective of the FMS should be system responsiveness
to user needs. User functions are discussed in Section 5.1.2.1. To the extent
that the user deals directly with the FMS, it should be easy to use and learn.
The FMS must provide quick response to service requests and rapid handling of
search and update operations.

TABLE 2. FMS FEATURES

DESIGN OBJECTIVES

    Responsiveness
        o Ease of use
        o Novice training
        o Quick response, search, and update

    Adaptability
        o Independence of logical data structure from FMS
        o Independence of logical data structure and program
        o Ability to combine data in unforeseen ways
        o Language and command definitional capability

    Efficiency in
        o Block utilization
        o Data representation for storage
        o Indexing arrangements
        o Retrieval strategy
        o Updating methods
        o Sharing of common data

    Reliability
        o Control of authorized access
        o Error recovery

FUNCTIONS

    User Functions
        o Query
        o Editing
        o Updating
        o Report generation
        o Program entry
        o Program execution
        o Novice training

    Interface with
        o User
        o Program
        o External file system
        o Job management system

    System Functions
        o Translation of data values (input, output, and storage)
        o Data base updating
        o Directory updating and indexing
        o Data search
        o Data search look-ahead
        o Maintenance of data usage statistics
        o User accountability
        o Backup and failure recovery (job and data restart points)

STRUCTURES

    General Considerations
        o Data base structure and size
        o Variable versus fixed block length
        o Intra-block structure and format
        o Data linkage
        o Priority ordering of data segments in files

    System Structures
        o Logical data directories
        o Data file dictionary (item name dictionary)
        o Indexes
        o Access rights table

    Item Structure
        o Logical subdivision and relations among user items
        o Degree of nesting permitted

    System Program Modules
        o Modularity
        o Generality of program functions

LANGUAGE ELEMENTS

    User Languages for
        o Program specification
        o Program execution
        o Data definition
        o Data entry
        o Query
        o Output format specification

    Programmer Languages for
        o Data updating
        o Data retrieval
        o Report generation
        o Task calling
        o Control transfers

    System Languages
        o Data coding schemes
        o Input/output formats
        o Interface with external file system
        o Interface with JMS (Job Management System)


5.1.1.2 Adaptability. The FMS must be capable of adapting to a wide variety
of user needs and environmental changes. In order to extend the useful life of
various parts of the system, and to minimize the implications of change, the
logical structure of data should be kept independent of both the FMS and the
using programs. The system should also be able to combine and use data in
unforeseen ways, so that the data structures and organization do not rigidly
determine the ways in which data may be used. Finally, the user should be
allowed to define his own languages and commands to the system to accommodate
special needs.

5.1.1.3 Efficiency. If the FMS is to meet effectively all the demands placed
on it, operating efficiency is an important factor. To make maximum use of the
available storage, methods of representing data for storage and methods of
utilizing the space within data blocks should be carefully considered. Data of
interest to more than one user should be capable of being shared, with proper
attention paid to protecting the data and, where necessary, providing control
over data access. Indexing arrangements are probably the crucial factor in
determining the speed and flexibility of accessing data. Data retrieval strategy
and the methods of updating and maintaining the data base will also play key
roles in determining system efficiency.
5.1.2 Functions

5.1.2.1 User Functions. While the FMS user will ordinarily be an individual,
task programs may also be considered users, inasmuch as a task program may call
on services provided by the FMS. For maximum flexibility, both kinds of users
should be able to call all FMS services, although the appropriate languages for
doing so need not be the same.

Since querying or obtaining information from the system is the
primary user function, the query facilities are extremely important. There
should be a variety of ways of specifying conditions under which data are
wanted. For the user who is not intimately familiar with the data base, it
would be helpful to have a dialog query capability, in which the user would
ask a series of increasingly specific questions, each based on the results
of the previous ones, until the desired item was found. This dialog could
also be the chief means of training novices in the use of the system.

Other user functions that may be called on by task programs as much
as by human users include editing or arranging information for some specific
purpose, updating the data base, and generating reports.

5.1.2.2 Interfaces. The FMS occupies a rather central position in the system
in that it interfaces with human users, task programs, and the operating system.
However, since the FMS isolates as well as connects these subsystems, changes
to one subsystem should have a minimal effect on the others.

5.1.2.3 System Functions. The system functions are the built-in, intrinsic
functions of the FMS and bear a large part of the responsibility for achieving
the design objectives. First, there are the functions dealing directly with
data, including translation of data values for input, output and storage, data
base updating, and data retrieval. When data retrieval is sequential or
patterned in some way, retrieval efficiency should be increased by performing
a look-ahead operation in conjunction with the FMS. Next, there are supervisory
data functions, including directory updating and data indexing (which should be
done automatically whenever the data base is changed), and maintenance of data
usage statistics. These statistics may be used to reorganize the data base in
a more efficient manner, either automatically or manually. Finally, the FMS
should play some role in keeping track of each user's use of the system and
should provide backup and failure recovery facilities through job and data
restart points or other means.
5.1.3 Structures

5.1.3.1 General Considerations. Various general structural considerations
affect the design of an FMS. The following structural features will be
considered:

(1) The organization of the data base;

(2) The expected size of the data base, and its implication
    on the system design;

(3) The length (fixed or variable) of the data block exchanged
    between the FMS and the operating system data management
    function;

(4) The logical structure and physical format of the data block;

(5) The ordering principle used to arrange data segments within
    files;

(6) The facilities which should be provided for data linkage.

5.1.3.2 System Structures. System structures are used by the
system to describe and provide access to the data base. These structures include:

(1) Logical data directories, which describe the logical structure
    of the data items and their logical position in the data base.

(2) Data file dictionaries (item name dictionaries), which provide
    a cross-reference between the symbolic item names used by the
    external user and the structural or other codes used to
    identify the item internally.

(3) Indexes, which tell where in the data base certain
    data values may be found, thereby enabling the system
    to perform searches without accessing the data itself,
    or with minimal accessing. The type and amount of
    indexing are critical in determining the system
    effectiveness since search time can be minimized through
    adequate indexing, while excessively detailed indexing
    requires much time and space for maintaining the indexes.

(4) Access rights tables, which enable the FMS to keep track
    of which users are authorized to access each portion of
    the data base, and for what purposes.
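
The way in which an item name dictionary, an index, and an access rights table
might work together can be sketched in Python. This is an illustration under
stated assumptions and not a description of any existing FMS: the class
SystemStructures, its method names, and the "read" purpose string are all
hypothetical. The sketch shows a search being answered from the index alone,
after the access rights table has been consulted.

```python
from collections import defaultdict


class SystemStructures:
    """Item name dictionary, index, and access rights table in one sketch."""

    def __init__(self):
        self.name_dictionary = {}        # symbolic item name -> internal code
        self.index = defaultdict(set)    # (internal code, value) -> record ids
        self.access_rights = {}          # (user, internal code) -> purposes
        self.records = {}                # record id -> {internal code: value}

    def define_item(self, name):
        code = len(self.name_dictionary) + 1
        self.name_dictionary[name] = code
        return code

    def grant(self, user, name, purposes):
        self.access_rights[(user, self.name_dictionary[name])] = set(purposes)

    def store(self, record_id, values):
        """Store a record and bring the index up to date at the same time."""
        coded = {self.name_dictionary[n]: v for n, v in values.items()}
        self.records[record_id] = coded
        for code, value in coded.items():
            self.index[(code, value)].add(record_id)

    def search(self, user, name, value):
        """Answer from the index without touching the stored records."""
        code = self.name_dictionary[name]
        if "read" not in self.access_rights.get((user, code), set()):
            raise PermissionError(f"{user} may not read {name}")
        return sorted(self.index.get((code, value), set()))
```

Every indexed value costs space in the index and time in store, which is the
trade-off between search time and index maintenance noted in item (3).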

Other system structures, such as a list of active tasks and their
requirements and a list of users and the extent to which they use system
facilities, may also be required if the FMS is performing job management functions.

5.1.3.3 Item Structure. Item structure refers to the logical structure of
individual data items. Items are divided into subitems, which may be further
subdivided and ultimately divided into fields or values. Certain substructures
may be repeated an arbitrary number of times; also certain substructures may be
optional. Relations among items may be expressed implicitly by the logical
nesting of items within other items, or explicitly by means of directories or
various kinds of data linkage. The degree of complexity permitted in item
structure is important. Allowing arbitrary complexity will entail a certain
overhead in system development costs and running time, but may be justified
because the system will be less subject to change arising from a need for data
structures more complex than those originally envisioned.
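
A small Python sketch may make the notions of nesting, repetition, and optional
substructures concrete. It is illustrative only: the representation of an item
as nested dictionaries and lists, the function nesting_depth, and the sample
field names are assumptions made for the example. A repeated substructure is
shown as a list, and an optional substructure is simply absent.

```python
def nesting_depth(item):
    """Degree of nesting: a plain value is 0, a structure is 1 + its deepest part."""
    if isinstance(item, dict):
        return 1 + max((nesting_depth(v) for v in item.values()), default=0)
    if isinstance(item, list):
        # A repeated substructure adds no level of its own.
        return max((nesting_depth(v) for v in item), default=0)
    return 0


# A hypothetical reference item: "authors" repeats, "abstract" is optional
# and omitted here, and "imprint" contains a further nested subitem.
reference = {
    "title": "A sample title",
    "authors": [
        {"surname": "Doe", "initials": "J."},
        {"surname": "Roe", "initials": "R."},
    ],
    "imprint": {
        "publisher": {"name": "Sample Press", "city": "Sample City"},
        "year": 1969,
    },
}

depth = nesting_depth(reference)
```

A system that limits the degree of nesting permitted could compare such a depth
against its limit when an item is defined.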

5.1.3.4 System Program Modules. The organization of system programs comprising
the FMS must also be considered. These programs should be as modular as possible
in order to facilitate implementation, debugging, and documentation, and in
order to minimize the effects of changes. The use of a standard method
for program interface will contribute to these ends. Finally,
program functions should be as general as possible so that the programs
may lend themselves to uses not originally foreseen and thereby extend
their useful life.
5.1.4 Language Elements

5.1.4.1 User Languages. Program specification languages are used to
define task programs. A special language may be provided for this purpose,
or the system may be built to accept the output of any standard procedural
language processor. Program execution languages call for the execution of
programs and supply them with necessary parameters. The user functions of
defining data structures and entering data require appropriate languages.
Language definitional facilities would be especially helpful for the data
entry function, particularly where large quantities of data are involved.
A query language is necessary to enable the user to retrieve information
from the data base. The principal considerations here should be flexibility
and the user's ability to obtain information in spite of a limited familiarity
with the data base. The user must also be able to specify the form in which
he wants the results presented. Hence, an output formatting or report
generation language is required.

5.1.4.2 Programmer Languages. Programmer languages are those used by the
task programmer to call on FMS services. The most important of these services
are data updating, data retrieval, and report generation. In addition, a
control language is needed so that tasks can call for the execution of other
tasks and so that control may be passed from task to task and between tasks
and the control system.

5.1.4.3 System Languages. System languages are those used by the FMS itself
when it operates upon data and interfaces with other subsystems. Data coding
schemes compress data in order to save space and also, possibly, to prevent
unauthorized access. Also, data must be formatted appropriately for input/output
operations. A primary interface of the FMS is with the operating system's data
management routine. Symbolic block names and data blocks themselves are exchanged
in both directions across this interface. The FMS also interfaces with the Job
Management System (JMS) in the operating system; the FMS can use the JMS as an
intermediary in dealing with the user.
5.2 APPLYING EVALUATION CRITERIA

5.2.1 Ml73?r^luatioc i3_ 

A number of evaluation criteria for FMS software have been discussed
in the previous paragraphs. This in itself does little to solve the problem
of determining the value of a given FMS in meeting the needs of the LINCS.
Neither does it determine the relative merit of competing software modules.
Rather, it specifies what factors should be considered in evaluating a given
system. No concrete procedure is known which can determine the value, or even
relative merit, of a system. This is due to the existence of different kinds
of evaluation criteria. The comparison of systems with different characteristics
(such as one which performs only part of the required operations in an inflexible
way to one which performs all of the operations required in a flexible way at much
higher cost) must be accomplished in careful tradeoff studies by skilled analysts.
Even after careful analysis, problems of this type may in fact have no definitive
solution.

If careful analysis reveals that no algorithmic solution is feasible,
one is free to look for effective heuristic methods - at least for one which
has the virtue of being readily applied. We now proceed to examine approaches
to problems of this kind which have been used or proposed.
5.2.2 How Criteria Have Been Applied

5.2.2.1 Weighting. If one assumes that each of the criteria for evaluating
a system is measurable, then, in general, a partial ordering is established
among all systems being compared. Consider four systems, S, T, U, and V, being
evaluated under three criteria, A, B, and C, each criterion measured on a scale
allowing a highest score of 10. The following result may be obtained:

               S    T    U    V

          A    4    5    ?    4
          B    3    9    4    5
          C    5    2    ?    8


System V is uniformly better than System U for all criteria, so that
there is no difficulty in making a choice. System U can be eliminated from
consideration. However, of the remaining systems, no one is uniformly better
than any others, and the best we can do is partially order the systems by criteria.
In order to break this impasse, one could choose the system which rated highest in
the largest number of criteria. System T would win by that measure, having the
highest rating in criteria A and B. Another approach would be to choose the
system with the highest total point score. This approach leads to the
selection of system V as shown below:

               S    T    V

          A    4    5    4
          B    3    9    5
          C    5    2    8

     TOTALS   12   16   17

However, if it were deemed that criterion B were (say) twice as
important as criteria A or C, the scores with weighted values would appear as
follows:





               S    T    V

          A    4    5    4
          B    6   18   10
          C    5    2    8

     TOTALS   15   25   22

with system T winning. Choosing a weighting factor for each criterion in effect
reduces the vector-valued criterion to a scalar (single value), which assures
the ability to transform the partial ordering into a total ordering. The most
universally appropriate technique is to reduce each criterion to a dollar cost.
However, even this measure is difficult to assign. For example, the "cost" of
a given measure of reliability is the present value of the cost of the series of
repairs that the system is expected to undergo (including cost of lost service).
But it is difficult to assess the (negative) "cost" of a system which has
an outstanding adaptability to changing requirements or data structures, or
that presents an extremely well-engineered interface with its users.
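
The three selection rules discussed in this section can be reproduced with a
short Python sketch using the scores tabulated above for systems S, T, and V.
The function names are assumptions made for the illustration, and "uniformly
better" is taken here to mean strictly better on every criterion, which is the
reading under which none of S, T, and V eliminates another.

```python
SCORES = {
    "S": {"A": 4, "B": 3, "C": 5},
    "T": {"A": 5, "B": 9, "C": 2},
    "V": {"A": 4, "B": 5, "C": 8},
}


def uniformly_better(x, y):
    """True if x scores strictly higher than y on every criterion."""
    return all(x[c] > y[c] for c in x)


def most_wins(scores):
    """Choose the system rated highest in the largest number of criteria."""
    wins = {name: 0 for name in scores}
    criteria = next(iter(scores.values()))
    for c in criteria:
        best = max(scores, key=lambda name: scores[name][c])
        wins[best] += 1
    return max(wins, key=wins.get)


def weighted_totals(scores, weights):
    """Reduce each system's vector of scores to a single weighted total."""
    return {
        name: sum(weights[c] * value for c, value in row.items())
        for name, row in scores.items()
    }


equal = weighted_totals(SCORES, {"A": 1, "B": 1, "C": 1})
b_doubled = weighted_totals(SCORES, {"A": 1, "B": 2, "C": 1})
```

With equal weights the totals favor V, and with criterion B doubled they favor
T, so the outcome depends entirely on the weights chosen.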
5.2.2.2 Benchmark Problems. One approach to the evaluation of systems is through
the use of benchmark problems. A benchmark problem is a complete simulation of
a situation to which the system is expected to respond. The simulation may be
designed to represent either a typical demand on the system, a situation of
extreme stress on the system, or a scenario of samples from a mix of problems
representing the projections for long-term demands on the system. Each of
these types of benchmark problems, when used to gauge the performance of a
system, either in terms of cost, responsiveness, or other factors, represents
a particular bias but, depending on the system requirements, may be a valid
gauge of system performance.

For a system such as LINCS, which provides an information service
to its subscribers which has an economic value (although perhaps an intangible
one), a cost measure determined from a benchmark problem representing long-term
demands seems to be most appropriate. Suitable cost components for user
investments such as training and time at console should be considered, along with
the cost of system purchase, operation, and maintenance. The system which satisfies
the benchmark problem with the lowest overall cost is the one selected.
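
A cost measure of this kind might be computed as in the following Python
sketch. It is an illustration under stated assumptions only: the cost component
names, the treatment of recurring costs as equal end-of-year amounts, the
discount rate, and the functions themselves are hypothetical, and any figures
supplied to them would have to come from an actual benchmark study.

```python
def present_value(yearly_costs, rate):
    """Discount a series of end-of-year costs back to the present."""
    return sum(
        cost / (1 + rate) ** year
        for year, cost in enumerate(yearly_costs, start=1)
    )


def overall_cost(system, rate):
    """One-time costs plus the present value of the recurring costs."""
    one_time = system["purchase"] + system["training"]
    yearly = system["operation"] + system["maintenance"] + system["console_time"]
    return one_time + present_value([yearly] * system["years"], rate)


def select(systems, rate):
    """Return the name of the system with the lowest overall cost."""
    return min(systems, key=lambda name: overall_cost(systems[name], rate))
```

Expressing repairs and lost service as yearly amounts lets reliability enter
the same calculation through its present value, as discussed in Section 5.2.2.1.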
6. BIBLIOGRAPHY

(1) Codasyl Committee: [title illegible], May 1969 (available through ACM).

The characteristics of the following systems are described in a
common terminology:

    ADAM          (MITRE Corp.)
    GIS           (IBM)
    IDS           (GE)
    ISL-1         (Info System Leasing Corp.)
    MARK IV       (Informatics)
    NIPS/FFS      (IBM)
    SC-1          (Western Electric/AUERBACH)
    [illegible]   (SDC)
    [illegible]   (RCA)

A bibliography is included.

(2) Fry, J. P. et al.: Data Management Systems Survey,
    MITRE Corp., January 1969.

This report presents the results of a survey of salient characteristics
of a representative set of state-of-the-art data management systems. It is part
of an effort to identify the state-of-the-art capabilities of data management
systems for third-generation computer systems.

Section I of the report includes general descriptions of the systems
surveyed and establishes the terminology for logical organization of data used
in the survey.

Section II describes the capabilities surveyed and presents the survey
results in tabular format.

The systems covered in this survey are:

    [illegible]   (CSC)
    [illegible]   (AUERBACH)
    [illegible]   ([illegible])
    GIS           (IBM)
    IDS           (GE)
    [illegible]   ([illegible])
    [illegible]   (Informatics)
    [illegible]   ([illegible] IBM)
    [illegible]   ([illegible])

A bibliography is included.

(3) Landau, K.: [title illegible].

(4) Pietrzyk, A., et al.: [title illegible], Center for Applied
    Linguistics, June 1960.

(5) Sable, J. D. and J. Cochrane: Data Management Systems
    Study. AUERBACH [report number illegible], April 1968.

(6) Ziehe, T. W.: Data Management: A Comparison of System
    Features, TRACOR 67-904-U, October 1967.






